Individual integration of sensory and semantic features in discriminal object recognition space
Abstract
331.9 from Society for Neuroscience Abstracts 1993. Poster at SfN Annual Meeting, Washington DC, November 1993.
ABSTRACT for Talk 1 at CSBBCS/EPS, U Toronto, July 1993
RICHARD FREEMAN WITH DAVID BOOTH
Perceptible aspects of objects and situations that are of sufficient interest to a culture acquire names or a descriptive vocabulary. Some aspects may be sensed without conceptual mediation. For others, some semantic processing is a prerequisite for identification or characterisation. Knowledge-contentful and knowledge-free descriptors of features of objects and situations may behave the same way grammatically but they seem to refer to very different kinds of percept. As a result, psychologists have divided perception between the psychophysics of material qualities (‘sensory’ features) and the recognition of meaningful attributes (‘semantic’ features).
Yet this distinction may not be as unmanageable as it sometimes seems. For example, multidimensional scaling (MDS) applies indifferently to “sensory” and “general knowledge” categories, as Medin and Barsalou called them in Steven Harnad’s book “Categorical Perception.” That may be because MDS is ‘non-metric’ and so side-steps the problems of scaling within and across material and symbolic parameters. This talk introduces a way of scaling all sensed material features and attributed semantic features on the same unit, thereby allowing measurement of causal interactions between the processes transmitting information through the mind about features of all types. First though, for comparison, let’s consider an example of the use of MDS to include semantic and sensory descriptors within one statistical model.
The data were collected from a consumer panel of 80 individuals who assessed eight orange-flavoured drinks that varied in levels of sucrose (table sugar) and citric acid, a major sour component of oranges. The two largest components from MDS of the whole panel’s data are plotted in the graph below (over the page). This statistical consensus model placed the drinks in positions relative to the descriptors “sour” and “sweet” that are appropriate to their contents of sugar and acid. The sample (S1A3) that was highest in acid (level coded “A3”) and lowest in sugar (“S1”) is closest to sourness and is far from the samples tested that were highest in sugar (S7A3) or lowest in acid (S2A1) on both the vertical and horizontal components of the MDS solution.
This psychometric space successfully accommodated both the sensory concepts ‘sour’ and ‘sweet’ and the semantically more complex concepts ‘risk to health’ and ‘calories.’ Furthermore, the semantic category ‘calories’ is close to the sensory category of sweetness (see bottom right of plot, over the page), as well as to the concept of risk to health, presumably from the high sugar level (S6) to which strong sweetness is attributed.
This statistical patterning of responses, however, is ambiguous as to whether a category is sensory or semantic. This is evident from the placing of the ratings of the drinks for “thirst quenching” (at the top of the plot, above). Arguably this is the appropriate position for the sensory category of an orange-like balance between moderately strong sweetness and considerable sourness. Equally, however, the position could represent ‘general knowledge’ from experiences of thirst being quenched by oranges or orange juice that taste strongly.
The psychophysical approach to which we now turn can resolve such ambiguities. We used the same orange-flavoured drink in our first attempts to use discriminal integration psychophysics to measure interactions between sensory and semantic features. The sensory feature was the taste of a sweetener (table sugar). The semantic feature was verbal information symbolising the energy contents of sweeteners used in drinks. The design is illustrated by the raw data (next page) from one of the 145 assessors who were familiar with this drink from vending machines. Two factors were varied independently among eight samples of the drink presented to each assessor. The sensory factor was the concentration of sugar in the sample (on the horizontal axis of each graph, in the equal-ratio scaling of log molarity). The semantic factor was the calories implied by a label alongside each sample.
[Figure: MDS solution for the whole panel; axes: Component 1 and Component 2.]
The label (stated beside each line of data) was either “Sugar” or “NutraSweet,” described as “a low-calorie sweetener” (as on the jars of the powder). As you can see from the graphs, each label was presented on drinks with the same four levels of sucrose, the only sweetener used in this experiment. The assessors had to make three quantitative judgments on each drink. First, they rated the sweetness of the sample relative to the sweetness of their usual orange drink, by marking a point on a horizontal line similar to the vertical axis in the uppermost graph on the previous page. Next, they estimated the relative energy content of the sample drink, using a horizontal version of the vertical axis in the middle graph above. Immediately we can see from the upper two graphs an advantage of a psychophysical approach over a psychometric approach. Sweetness and calories were not differentiated by the MDS model shown at the start of the talk.
Yet sweetness and calories gave very different pairs of psychophysical functions, showing very strongly the expected effect of the contrast in semantics between the word “sugar” and the brand name of an artificial sweetener. Finally, the assessors rated their disposition to choose each particular drink, from “ALWAYS” to “NEVER.” These ratings can be ‘unfolded’ (Coombs, 1964), as shown in the graph at the bottom above, by differentiating too much sugar from too little sugar. This produces a monotonic relationship between levels of sugar and ratings of preference, rather than an inverted V. This individual’s data are interpretable in quite a complex way by comparing the shapes of the six psychophysical functions. Reading horizontally across the data in the bottom graph on the previous page, the ideal level of sweetness (provided by sucrose) was considerably higher when the taste was attributed to the low-calorie sweetener than when it was attributed to ‘full calories’ sugar. Only a dislike of sweetness perceived to be much greater than usual (top graph) suppressed this liking for sweetness (right-hand end of bottom graph) when the drink was believed to be free of calories (middle graph). Clearly, if we could measure cognitive interactions among the psychophysical functions, such interpretation might be made much more securely. Therefore we’ll spend the rest of this talk explaining how we measure sensory-semantic interactions using the data from another assessor, one who professed always to use “diet drinks.” The key innovation is to scale both sensory and semantic stimuli on the same metric, using the traditional measure of discrimination performance, known in subjectivist terminology as the “just-noticeable difference” (JND). Instead of assuming that the descriptive ratings measure differences in strength of private sensations, we rely only on the objective fact that they are sensitive to disparities in quantity in the presented stimuli, be it material or symbolic. 
More data are needed to address the question whether this objective discrimination is achieved phenomenologically, verbally, neurally or by some other mode of processing: we tackle such issues after we have measured the discriminative performance of each of the six response-stimulus relationships, using the conventional Case V of Thurstone (1927) for a “discriminal process.”
Just as in the estimation of a JND, what we call the half-discriminated disparity (HDD) in levels of a stimulus, between s0 and s1 in the graph above, is at 50% overlap between the probability distributions of the responses at each level, estimated from the mean square error of least-squares linear regression through the raw data from an individual within a session. That is, the 75% quantile of the estimated distribution of responses to the lower level of stimulus is superposed on the 25% quantile for the higher level. (The graph as drawn above does not have enough overlap.) Thus the formula for the HDD (JND) is the square root of the mean square error around the regression line divided by the line’s slope, multiplied by twice the z value for 25% (0.675).
[Figure: response distributions at stimulus levels s0 and s1; horizontal axis: stimulus levels.]
The assumptions of linearity and constant residual variance are testable on data (and have held up well over many experiments in our lab). As long as least-squares linear regression computes, very few data are needed to estimate an HDD, and also to interpolate a value for the ‘norm’ of that stimulus for that person in that session, that is, whatever implicit usual value, ideal point or ‘template’ the assessor anchored the ratings on in that context. Of course, both estimates go down in reliability with smaller numbers of data. Nevertheless, the orderliness of merely four data-pairs, as in this experiment, can be monitored as the variance that the regression accounts for. This assessor’s sugar-sweetness functions for “NutraSweet”-labelled drinks and “Sugar” drinks are given below.
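The HDD calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' own program: the function names, the use of n − 2 degrees of freedom for the mean square error, and the `norm_rating` argument (the rating value that stands for "usual" or "ideal") are all hypothetical choices, and the data in the usage lines are made up.

```python
import math

Z_QUARTILE = 0.675  # z value for the 25% / 75% quantiles, as given in the text


def fit_line(levels, ratings):
    """Ordinary least-squares fit of rating = intercept + slope * level."""
    n = len(levels)
    mean_x = sum(levels) / n
    mean_y = sum(ratings) / n
    sxx = sum((x - mean_x) ** 2 for x in levels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(levels, ratings))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def hdd_and_norm(levels, ratings, norm_rating=0.0):
    """Return (HDD, norm level) for one assessor's ratings in one session.

    HDD = 2 * 0.675 * sqrt(MSE) / |slope|, in the units of `levels`.
    The norm is the stimulus level at which the fitted line reaches
    `norm_rating` (hypothetical parameter: the rating meaning "as usual").
    """
    slope, intercept = fit_line(levels, ratings)
    residuals = [y - (intercept + slope * x) for x, y in zip(levels, ratings)]
    # Assumption: residual mean square with n - 2 degrees of freedom.
    mse = sum(e ** 2 for e in residuals) / (len(levels) - 2)
    hdd = 2 * Z_QUARTILE * math.sqrt(mse) / abs(slope)
    norm_level = (norm_rating - intercept) / slope
    return hdd, norm_level


# Made-up example: four stimulus levels and four ratings from one session.
example_levels = [0.0, 1.0, 2.0, 3.0]
example_ratings = [1.0, 9.0, 19.0, 31.0]
hdd, norm = hdd_and_norm(example_levels, example_ratings, norm_rating=15.0)
print("HDD:", hdd, "norm level:", norm)
```

With only four data-pairs, as in the experiment, both estimates carry wide confidence limits, which is why the text monitors the variance accounted for by the regression.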
Unsurprisingly for a user of diet drinks, the function is more reliable for the samples believed to be low in calories (r² = 0.90) than for the declared sugar-containing drinks (r² = 0.60).
[Figure: sweetness ratings against concentration of sucrose (log10 M) for each label.]
The recognition points (RP), i.e. the norms for “usual sweetness,” were similar for the two labels. The slope was lower for “sugar”-labelled samples and the MSE was larger (smaller r²), and so the HDD was considerably worse (higher). “DT” (discrimination threshold) in the graphs is the Weber Fraction (HDD − 1). This HDD of 1.12 is close to the limit for sugars in water and so this person’s sucrose discrimination performance by normed sweetness ratings appears to be very good with the desired low-calorie sweetener label, albeit subject to the wide confidence limits on regression through only four data.
Exactly the same calculations were performed on the sweetener calories feature. Two unquantified levels of a category can be discriminally scaled but in this case the “low-calorie sweetener” brand is declared on the jar to have an energy content 10% that of sugar (100%) and so a quantitative function was calculated for the two levels of caloric label.
We can now scale the stimulus values for each of these sweetness functions (and the calories and choice functions) in units of HDDs from norm (RP). Each function is a piece of evidence on the causal processes in the mind of the assessor during the session, i.e. the cognitive mechanisms of the decisions made on sweetness, calories and likelihood of choice. Fine discrimination by a rating is the same thing as strong control of the rating by the stimulus. We can now produce the psychophysical advance on the psychometric modelling illustrated at the start. This time the two-way graph visualises determinate formulae for a person’s interacting mental mechanisms, instead of the statistical fitting of grouped responses into a minimally stressed map.
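The rescaling of stimulus values into HDDs from norm can be sketched as follows. This is a minimal illustration that assumes the HDD is obtained in log10 concentration units and is reported as a ratio of concentrations (so that an HDD of 1.12 corresponds to a Weber fraction of 0.12); that reading, and all function names, are assumptions rather than something the talk states explicitly.

```python
import math


def hdd_as_ratio(hdd_log10):
    """Convert an HDD in log10 concentration units into a concentration ratio."""
    return 10 ** hdd_log10


def weber_fraction(hdd_ratio):
    """Weber fraction under the assumption that the HDD is a ratio (HDD - 1)."""
    return hdd_ratio - 1.0


def hdds_from_norm(log_level, log_norm, hdd_log10):
    """Signed distance of a stimulus level from the norm (RP), in HDD units.

    Negative values are below the norm (e.g. not sweet enough),
    positive values are above it (e.g. too sweet).
    """
    return (log_level - log_norm) / hdd_log10


# Illustrative values only: an HDD ratio of 1.12 expressed in log10 units.
hdd_log10 = math.log10(1.12)
print("Weber fraction:", weber_fraction(hdd_as_ratio(hdd_log10)))
print("HDDs from norm:", hdds_from_norm(-0.95, -1.00, hdd_log10))
```

Because every feature ends up on the same unit, a sugar level and a calorie label can then be placed on the axes of one graph.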
[Figure: label levels in HDDs from norm (zero calories) against sugar levels in HDDs from norm (zero = ideally sweet level).]
Each point (X) on this HDD-scaled graph of the two stimulus features (looking down on the base of a cone from its apex) is a sample of the drink as perceived by the assessor’s ratings of sweetness (x axis) or calories (z axis). The plot shows that all the samples were too calorific for this diet drink user, although two only slightly so (labelled “NutraSweet”). One of the samples was definitely too sweet (~1.2 HDDs above ideal, or recognition point, RP), while three samples were far from sweet enough (>>1 HDD below ideal).
The right-angled triangle drawn on this stimulus graph shows two possible sorts of interaction between the distances of sugar levels and calorie levels from ideal. If the taste of sugar and the meaning of “calories” combine as two distinct features, the distance in HDDs of a sample from the joint ideal point (0, 0) is the length of the hypotenuse of that triangle (2D on the graph). If, on the other hand, preference (say) treats sweetness and calories as the same, the combination puts the two distances end-to-end, i.e. adds them together (1D). Note that the signs of HDDs from norm are retained in 1D models and so the sum may be a subtraction and can give a positive or negative value, whereas 2D models are unsigned (or always positive) because values are squared before summing and then taking the square root in accord with Pythagoras’s Theorem.
Two-dimensional integration gives a moderately good account of this diet-drink user’s choice ratings (scatterplot below), with r² = 0.64. In contrast, output from 1D modelling (on the next page) is effectively nonsensical, forming an L-shaped function with a slope largely attributable to a single sample of the drink and accounting for only a third of the variance.
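The contrast between the 2D and 1D combinations can be sketched in Python. This is a hedged illustration, not the authors' analysis: the function names are hypothetical, the sample values are made up, and the squared Pearson correlation is used here as one simple way to express the variance in choice ratings accounted for by each model.

```python
import math


def integrate_2d(d_sweet, d_cal):
    """Two distinct features: unsigned Euclidean distance from the joint norm."""
    return math.hypot(d_sweet, d_cal)


def integrate_1d(d_sweet, d_cal):
    """One shared feature: signed distances placed end to end (simple sum)."""
    return d_sweet + d_cal


def r_squared(x, y):
    """Squared Pearson correlation between a model's distances and the ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Made-up samples: (sweetness HDDs from norm, calorie HDDs from norm, choice rating).
samples = [
    (-3.0, 0.5, 20.0), (-1.0, 0.5, 60.0), (1.2, 0.5, 55.0),
    (-3.0, 4.0, 10.0), (-1.0, 4.0, 25.0), (1.2, 4.0, 22.0),
]
choice = [rating for _, _, rating in samples]
dist_2d = [integrate_2d(s, c) for s, c, _ in samples]
dist_1d = [integrate_1d(s, c) for s, c, _ in samples]
print("2D r^2:", round(r_squared(dist_2d, choice), 2))
print("1D r^2:", round(r_squared(dist_1d, choice), 2))
```

Comparing the two r² values for each assessor is what allows the models to be contrasted across individuals and groups.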
Hence in this person, the sweet taste and the calorie label are perceptually distinct, and indeed could be sensory and semantic features respectively. Hence, discrimination psychophysics resolves the cognitive processes in choice, whereas psychometric response patterning puts sweetness and calories close together in both of the first two components (see the first graph in this talk).
Obviously, more than three sets of eight data-pairs are needed to draw reliable conclusions about an individual in a situation. We are pursuing issues of replication within individuals. Nevertheless, we can test for reliability to some extent by looking for systematic effects across independently characterised assessors within this first experiment by itself. The amount of variance in rated choice accounted for by the 2D and 1D models differed reliably in pattern between diet-drink users and sugar-drink users. In two-way ANOVA, 2D did better than 1D overall (p < 0.05) but there was no reliable main effect of the two sorts of habit. The interesting result was a reliable interaction effect (p < 0.05), with 2D accounting for more variance in preference than 1D in users of low-calorie drinks and the other way round in users of sugar drinks.
The data need examining more closely but there is a fairly straightforward interpretation of this contrast between the groups. The users of diet drinks are personally very familiar with ‘uncoupling’ of sweetness and calories: the taste is a sensed material characteristic but the construct of calories is the key part of a highly salient belief about a drink. Hence the ‘2D’ decision-making processes evident in the one assessor above may be quite common among users of low-calorie drinks.
In contrast, those who habitually opt for sugar-containing drinks may treat sweetness as meaning the same thing as the (high) calorie label: a symbol conveying the amount of energy in the drink which they desire, maybe to kill off hunger or to boost “energy” in the sense of bodily and/or mental vigour.
To conclude, the new approach starts with complete analysis of the data from each individual and postpones the modelling of the group (and subgroups) until that can be done on the performance characteristics of the individuals. In MDS, individuals can be plotted as vectors. Nevertheless the model forces a consensus across the panel in the first few components. More to the point, that approach is incapable of measuring what is going on in each individual’s mind. Also the new approach uses all of the data, not just response values but stimulus values as well. That is, the approach is psychophysical, operating on manifest variables to derive evidence of underlying causation, in contrast to psychometrics, which models patterns in the responses and is content to recover information about stimuli as latent variables and as data-structures whose origins are left unspecified. Finally, this approach is mathematically fully determinate, with no loose parameters. Even the estimation of each elemental psychophysical function from the raw data from an individual in a session by a least-squares statistic is in fact algebraically determinate. This all contrasts with MDS and other psychometric modelling that improve the fit to data by varying weights etc.
Annotated Bibliography (April 2010)
The theory generalising normed discrimination scaling to multiple features was published in the same year as these talks. The extensions from multisensory integration to multiconceptual and sensory-conceptual interactions were also made in that paper.
Booth, D.A., & Freeman, R.P.J. (1993). Discriminative feature integration by individuals. Acta Psychologica 84, 1-16.
The full analysis in accord with Booth and Freeman (1993) that was made at the time of the data from the second experiment on orange drink above has now been published in brief.
Freeman, R.P.J., & Booth, D.A. (2010). Users of ‘diet’ drinks who think that sweetness is calories. Appetite 55, in press.
‘[Scientific] Postscript’
The graph above of two stimulus features (on x and z axes) was described as a view down from the apex of a cone onto its base. The vertical dimension of that cone (the y axis) is the response that is cognitively integrated from the two features. Pythagoras’s Theorem generalises to any number of dimensions and so the theory includes an unvisualisable ‘hypercone’ involving three or more stimulus features (Booth & Freeman, 1993).
The same theory of multiple-featured objects and situations improves on the linear unfolding of choice ratings and other responses peaked on levels of the stimulus, used above and in normed discriminal scaling of a single sensory feature in a familiar context by Mark Conner with David Booth, from which the Booth/Freeman approach was generalised. Observed data are theoretically on the surface of the cone at a vertical cut parallel to the axis of the stimulus being varied. If this is the x axis, then the z axis can be considered to be an integrated stimulus of all the other features in the familiar context. When the assessed sample is perfect in all respects except the varied feature, then the data fall on the isosceles triangle through the apex of the cone. When however there is some ‘defect’ in the context (such as a water-clear or red-coloured orange-flavoured drink), then the vertical cut is some way from zero on the z axis and the data fall on a conic section (right hyperbola), as in the graph below. (This quantitative theory of ‘contextual defects’ in data on sensory acceptance was presented in a plenary session of the first Pangborn Symposium, held in Finland in August 1992.)
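The cone geometry described in this postscript can be written out as a short derivation. This is a sketch in notation introduced here for illustration: R is the integrated response, R_0 its value at the joint norm, b the loss of response per HDD of distance, d_S the varied feature in HDDs from norm, and c the contextual defect in HDDs from norm; none of these symbols except d_S (the talk's dS) comes from the source.

```latex
\begin{align}
  % Response on the surface of the cone over the (d_S, c) stimulus plane
  R(d_S, c) &= R_0 - b\,\sqrt{d_S^{2} + c^{2}} \\
  % Perfect context (c = 0): the isosceles triangle through the apex
  R(d_S, 0) &= R_0 - b\,\lvert d_S \rvert \\
  % Defective context (c \neq 0): a vertical cut, which is a hyperbola
  \frac{(R_0 - R)^{2}}{b^{2} c^{2}} - \frac{d_S^{2}}{c^{2}} &= 1,
  \qquad R \le R_0 \\
  % Peak of the hyperbola, lowered by b|c| in response units
  R(0, c) &= R_0 - b\,\lvert c \rvert \\
  % The horizontal tangent at the peak meets the triangle where
  R_0 - b\,\lvert d_S \rvert = R_0 - b\,\lvert c \rvert
  \;&\Longrightarrow\; \lvert d_S \rvert = \lvert c \rvert
\end{align}
```

On this reading, the drop of the peak below the apex, b|c|, is the defect in response units, and the intersection with the bounding triangle gives the same defect in HDDs from norm.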
Figure caption: A contextual defect in integrative ratings (y axis) of a single feature (x axis) with the joint norm being at zero on the z axis (going into and out of the page). dS: discrimination-scaled levels of the varied feature, S. dC: contextual defect measured in units of strength of response. The intersection of the horizontal tangent of the peak of the conic section (continuous line) with the bounding isosceles triangle (broken line) measures the contextual defect in stimulus units, i.e. in HDDs from Norm. (Figure from Booth & Freeman, 1993)
Talk 2 at CSBBCS/EPS
Cognitive processes of recognising an object by two of its features
e.g., personal acceptance norms (ideal points) for the levels of sugar and of cream in an ice-cream or in a cup of sweet milky coffee
A person’s overall liking for an ice-cream or a cup of sweet milky coffee may be based on awareness (sensations) of its sweetness and creaminess. In this talk, we call that sort of mental processing “PERceiving.” Here is a diagram of the information-transforming channels involved. Note that the integrative decisions come after the states of experiencing.
Sometimes, instead, the decision to accept the ice-cream or the coffee might be under “straight-through” control by stimulation of the senses by the levels of sugar and of cream (Booth, Conner & Marie, 1987). That is, these two sources of stimulation may be integrated preattentively or subconsciously. Here we call that sort of achievement “SUBceiving.” In this case, the integrative decision pathway operates without going through either of the two states of subjective experiencing.
Those two diagrams can be superposed, to put the alternative pathways (channels) onto one graph. Also just one of the features can be considered, albeit still within its context of the whole sensed object (and situation).
Changing the example from the sugar and cream in the coffee to the bitterness of its caffeine content, either a SUBceptual process or a PERceptual process might influence the acceptance of a cup of a particular coffee, or a recognition of how good in quality that sample of coffee was.
Thirdly, the verbal concept of bitterness of taste might by itself drive the recognition of quality (or the ideal level of caffeine for that drink of coffee): a CONceptual process. Adding that third possible information-transmitting channel to the combined diagram gives the full set of pathways.
Booth, Conner and Gibson (1989) found that subceptual discrimination between levels of caffeine was quite common in preferences among coffees. Each assessor’s Weber-Fechner ratio (JND, HDD) for discrimination between caffeine levels was calculated from ratings of bitterness or acceptance and the concentration of caffeine as described in the previous talk. In about half the people tested on their usual drink of coffee, the ratio was much higher for bitterness than for preference, lying above the normal distribution that was seen around a (log) ratio of zero, from about 1/10 to 10/1 or so. The number of people at each difference between the ratios is given below.
[Figure: number of people at each value of the discrimination ratio (DR) for bitterness minus the DR for preference.]
Publication date: 2010